ABSTRACT


Sentiment Analysis (SA) is an ongoing field of research within text mining. SA 
is the computational treatment of opinions and sentiments expressed in text. 
This paper gives a comprehensive overview of recent developments in this 
field. Many recently proposed algorithm enhancements and various SA 
applications are investigated and presented briefly in this paper. The fields 
related to SA that have recently attracted researchers (transfer learning, 
emotion detection, and building resources) are discussed. The main objective 
of this paper is to give a nearly complete picture of SA techniques and the 
related fields, with brief details. The main contributions of this paper include 
a sophisticated categorization of a large number of recent articles and an 
illustration of recent research trends in sentiment analysis and its related 
areas. 

Introduction: 

Sentiment Analysis (SA), or Opinion Mining (OM), is the 
computational study of people's opinions, attitudes, and 
emotions toward an entity or topic. The entity can be the 
subject of individuals' emotions and views on any topic. These 
topics are most likely to be covered by reviews. The two 
expressions, sentiment analysis and opinion mining, are often 
used interchangeably. However, some researchers have stated 
that OM and SA have slightly different notions [1]. Opinion 
Mining extracts and analyzes people's opinions about a topic, 
while Sentiment Analysis identifies the sentiment expressed in 
a text and then analyzes it. Therefore, the target of SA is to 
find opinions, identify the sentiments they express, and then 
classify their polarity. 



Figure 1. Sentiment analysis process on product reviews 


Sentiment Analysis can be considered a classification 
process, as illustrated in Figure 1. There are three main 
classification levels in SA: 

1. Document-level Sentiment Analysis 

2. Sentence-level Sentiment Analysis 

3. Aspect-level Sentiment Analysis 

> Document-level SA aims to classify an opinion document as 
expressing a positive or negative opinion or sentiment. It 
treats the whole document as a basic information unit 
(discussing one topic). 

> Sentence-level SA aims to classify the sentiment expressed 
in each sentence. The first step is to identify whether the 
sentence is subjective or objective. If the sentence is 
subjective, sentence-level SA determines whether it expresses 
a positive, negative, or neutral opinion. 

> Aspect-level SA is based on the idea that an opinion 
consists of a sentiment and a target of that opinion. 

Existing System: 

1. Study the text features of social media messages in the 
context of developing methods for their sentiment analysis. 

2. Develop a method for automatic sentiment analysis of 
Twitter messages. 

Sentiment analysis, also called opinion mining, involves the 
automatic analysis of opinions and emotive vocabulary 
expressed in a text. In analyzing the tonality of a text, 
information on the Internet is considered to fall into two 
classes: facts and opinions. The definition of an opinion is 
therefore a key concept. Opinions are divided into two 
types: 

1. Simple opinion. 

2. Comparison 

A simple opinion contains a statement by an author about 
one entity. It can be stated directly: "I was pleasantly 
surprised with the furniture assembly quality", or implicitly: 
"After the treatment, my health became stronger". In both 
cases, a simple opinion usually has a positive or negative 
sentiment. In analyzing the tonality of a text, the 
following formal definition is given for the first type of 
opinion: a simple opinion is a tuple of five elements (entity, 
feature, sentiment value, holder, time), where entity is the 
object about whose aspect (feature) the author (holder) 
expressed an opinion at a given time (time). 

There are three types of sentiment value: positive, negative, 
and neutral. A neutral value means that the text does not 
contain an emotional component. An entity is a person, 
organization, event, product, or topic of discussion. 
Therefore, in various publications, an entity is also called an 
object or topic. Often, an entity can be represented as a 
hierarchical tree of components and sub-components. 
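The five-element definition above can be written down as a small data structure. The following is a minimal sketch, not part of the original system: the class and field names are chosen here for illustration, and the holder and time values in the example are hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Sentiment(Enum):
    """The three sentiment values named in the text."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class SimpleOpinion:
    """A simple opinion: (entity, feature, sentiment value, holder, time)."""
    entity: str           # person, organization, event, product, or topic
    feature: str          # the aspect of the entity being judged
    sentiment: Sentiment  # positive, negative, or neutral
    holder: str           # the author of the opinion
    time: datetime        # when the opinion was expressed


# The direct opinion quoted in the text, encoded as a tuple.
# The holder and time are hypothetical placeholders.
opinion = SimpleOpinion(
    entity="furniture",
    feature="assembly quality",
    sentiment=Sentiment.POSITIVE,
    holder="review author",
    time=datetime(2020, 1, 1),
)
```

A hierarchical entity could be modelled by letting `feature` name a component or sub-component of the entity; that extension is left out of the sketch.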


PROPOSED SYSTEM 



The proposed system includes the following steps: 


Step 1: Data Collection: 

Data can be collected from any social website. The data used 
in this study are online product reviews collected from 
Twitter. Experiments for both sentence-level and review-level 
categorization are performed, with promising outcomes. 
Finally, we also give insight into our future work on 
sentiment analysis. 

Step 2: Data Pre-processing: 

This step removes all unnecessary tweets, such as retweets, 
replies, and tweets that do not express any emotion. It also 
includes stop-word removal and the other pre-processing 
implemented in our base paper, "student learning". 
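A minimal sketch of this filtering step is given below. It makes several assumptions that are not in the original: retweets are recognized by the "RT @" prefix, replies by a leading "@", and the stop-word list is a small hypothetical sample. Dropping tweets that express no emotion needs a subjectivity check and is not covered here.

```python
import re

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "this"}


def is_retweet(text):
    """Assumption: retweets start with the conventional 'RT @' prefix."""
    return text.startswith("RT @")


def is_reply(text):
    """Assumption: replies start with an @-mention."""
    return text.startswith("@")


def preprocess(tweets):
    """Drop retweets and replies, then lowercase, tokenize, and remove stop words."""
    cleaned = []
    for text in tweets:
        text = text.strip()
        if is_retweet(text) or is_reply(text):
            continue
        tokens = re.findall(r"[a-z']+", text.lower())
        tokens = [tok for tok in tokens if tok not in STOP_WORDS]
        if tokens:
            cleaned.append(tokens)
    return cleaned


sample = [
    "RT @shop: great deal",
    "@bob thanks",
    "This phone is great and the camera is sharp",
]
tokens_per_tweet = preprocess(sample)
```

Only the third tweet survives the filter; its stop words are removed before feature extraction.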

Step 3: Feature Extraction: 

Here we try different combinations of features such as 
unigrams, POS tags, and Twitter-specific features. Every word 
of a sentence has a syntactic role that defines how the word 
is used. These syntactic roles are also known as the parts of 
speech. There are 8 parts of speech in English: the verb, the 
noun, the pronoun, the adjective, the adverb, the preposition, 
the conjunction, and the interjection. In natural language 
processing, part-of-speech (POS) taggers have been 
developed to classify words based on their parts of speech. 
For sentiment analysis, a POS tagger is very useful for the 
following two reasons: 

1) Words such as nouns and pronouns usually do not carry 
any sentiment. A POS tagger makes it possible to filter 
out such words. 

2) A POS tagger can also be used to distinguish words that 
can be used as different parts of speech. For instance, as 
a verb, "enhanced" may convey a different amount of 
sentiment than it does as an adjective. 

The POS tagger used for this research is a max-entropy POS 
tagger developed for the Penn Treebank Project. The tagger 
provides 46 different tags, meaning that it can identify more 
detailed syntactic roles than just the 8 parts of speech. As an 
example, Table 1 lists all the verb tags included in the POS 
tagger. 
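The two uses of POS tags described above can be sketched as follows. This is an illustration under stated assumptions, not the tagger used in this research: the input is assumed to be already tagged with Penn Treebank tags, the example tags are assigned by hand, and the function names are hypothetical.

```python
# Penn Treebank tags for nouns and pronouns, which usually carry no sentiment.
NOUN_AND_PRONOUN_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP", "PRP$", "WP", "WP$"}


def drop_nouns_and_pronouns(tagged_tokens):
    """Reason 1: filter out words whose tag marks them as a noun or pronoun."""
    return [(word, tag) for word, tag in tagged_tokens
            if tag not in NOUN_AND_PRONOUN_TAGS]


def word_tag_feature(word, tag):
    """Reason 2: keep the tag with the word, so that 'enhanced' used as a verb
    and 'enhanced' used as an adjective become two different features."""
    return f"{word.lower()}/{tag}"


# Hand-tagged example sentence (the tags are assigned manually for illustration).
tagged = [("I", "PRP"), ("love", "VBP"), ("this", "DT"),
          ("enhanced", "JJ"), ("camera", "NN")]

kept = drop_nouns_and_pronouns(tagged)
features = [word_tag_feature(word, tag) for word, tag in kept]
```

In practice the `(word, tag)` pairs would come from a POS tagger rather than being written by hand.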

Step 4: Feature selection: 

Next, we select the best features. We propose a set of 
features, listed in Table 4, for our experiments. There are a 
total of 50 types of features. We calculate these features for 
the whole tweet and for the last one-third of the tweet. In 
total we get 100 additional features. We refer to these 
features as Senti-features throughout the paper. Our features 
can be divided into three broad categories: 

First, features that are primarily counts of various 
properties, so the value of the feature is a natural number 
($\in \mathbb{N}$). 

Second, features whose value is a real number 
($\in \mathbb{R}$). These are primarily features that capture 
the score retrieved from DAL. 

Third, features whose values are Boolean ($\in \mathbb{B}$). 
These are bag of words, presence of exclamation marks, and 
presence of capitalized text. 

Each of these broad categories is divided into two 
subcategories: Polar features and Non-polar features. We 
refer to a feature as polar if we calculate its prior polarity 
either by looking it up in DAL (extended through WordNet) 
or in the emoticon dictionary. All other features, which are 
not associated with any prior polarity, fall into the Non-polar 
category. Each of the Polar and Non-polar groups is further 
subdivided into two categories: POS and Other. POS refers to 
features that capture statistics about the parts of speech of 
words, and Other refers to all other types of features. 
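The idea of computing each feature twice, once over the whole tweet and once over its last one-third, can be sketched as below. This is a simplified illustration with three non-polar features only; the feature names, the whitespace tokenization, and the rule for capitalized text are assumptions, and the DAL-based real-valued features are omitted because they need the dictionary itself.

```python
def basic_features(tokens):
    """Three non-polar features over a list of tokens."""
    return {
        # Count feature (natural number): number of exclamation marks.
        "n_exclaim": sum(tok.count("!") for tok in tokens),
        # Count feature: fully capitalized words longer than one character.
        "n_capitalized": sum(1 for tok in tokens if len(tok) > 1 and tok.isupper()),
        # Boolean feature: presence of at least one exclamation mark.
        "has_exclaim": any("!" in tok for tok in tokens),
    }


def whole_and_last_third(text):
    """Compute every feature for the whole tweet and for its last one-third."""
    tokens = text.split()
    n_last = max(1, len(tokens) // 3)          # at least one token in the tail
    last_third = tokens[-n_last:] if tokens else []
    features = {}
    for name, value in basic_features(tokens).items():
        features["whole_" + name] = value
    for name, value in basic_features(last_third).items():
        features["last_" + name] = value
    return features


feats = whole_and_last_third("I LOVE this phone, the camera is GREAT !!")
```

With 50 feature types in place of the three shown here, the same doubling yields the 100 features described above.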

Step 5: Classification 

Here we use the Naive Bayes classifier. The naive Bayesian 
classifier works as follows: Suppose that there exists a set of 
training data, D, in which each tuple is represented by an 
n-dimensional feature vector, $X = (x_1, x_2, \ldots, x_n)$, 
indicating n measurements made on the tuple from n attributes 
or features. Assume that there are m classes, 
$C_1, C_2, \ldots, C_m$. Given a tuple X, the classifier will 
predict that X belongs to $C_i$ if and only if 
$P(C_i \mid X) > P(C_j \mid X)$, where $i, j \in [1, m]$ and 
$i \neq j$. Assuming that the features are conditionally 
independent given the class, $P(C_i \mid X)$ is computed, up to 
a normalizing constant that is the same for every class, as: 

$$P(C_i \mid X) \propto P(C_i) \prod_{k=1}^{n} P(x_k \mid C_i)$$
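The decision rule above can be sketched in a few lines. The following is a minimal illustration, not the implementation used in this work: it treats each token as one feature $x_k$, works with log-probabilities to avoid underflow, and adds Laplace (add-one) smoothing, which is an assumption made here so that unseen word and class pairs do not produce a zero probability. The training sentences are toy examples invented for the sketch.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(tokens, model):
    """Return the class C_i maximizing log P(C_i) + sum_k log P(x_k | C_i)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)              # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocabulary:
                continue                                    # ignore unseen words
            # Laplace-smoothed likelihood (assumption of this sketch).
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration.
docs = [["great", "phone"], ["love", "camera"],
        ["bad", "battery"], ["terrible", "screen"]]
labels = ["positive", "positive", "negative", "negative"]
model = train_naive_bayes(docs, labels)
prediction = predict(["great", "camera"], model)
```

Because both toy classes have the same prior, the prediction here is decided entirely by the product of the per-word likelihoods.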

Step 6: Comparison of Results: 

As the last step we will compare the results. 

Comparison of models: For this task, the unigram model 
achieves a gain of 23.25% over the chance baseline. Table 8 
compares the performance of our three models. We report the 
mean and standard deviation of 5-fold test accuracy. We 
observe that the tree kernels outperform the unigram and 
the Senti-features models by 4.02% and 4.29% absolute, 
respectively. We note that this difference is much more 
pronounced than in the two-way classification task. 
Once again, our 100 Senti-features perform almost as well as 
the unigram baseline, which has about 13,000 features. We 
also experiment with combinations of models. For this 
classification task, the combination of the tree kernel with 
Senti-features outperforms the combination of unigrams with 
Senti-features. 
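The reporting scheme used above (mean and standard deviation of 5-fold test accuracy, and gain over a chance baseline) can be sketched as follows. This is an illustration only: the helper names are hypothetical, the fold assignment is a simple interleaved split, and the use of the sample standard deviation is an assumption, since the text does not say which variant is reported.

```python
import statistics


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def summarize(fold_accuracies, chance_accuracy):
    """Return (mean, sample standard deviation, absolute gain over chance)."""
    mean = statistics.mean(fold_accuracies)
    std = statistics.stdev(fold_accuracies)
    return mean, std, mean - chance_accuracy


# Splitting 10 samples into 5 folds: each sample is tested exactly once.
splits = list(k_fold_indices(10, 5))
```

Each model is trained on the `train` indices and scored on the `test` indices of every split; the five accuracies are then passed to `summarize` together with the chance accuracy of the task.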

DESIGN PHASE 



Opinion mining applications are the basic infrastructure of 
large-scale collaborative policymaking. They help make 
sense of thousands of interventions. They also act as an 
early warning system for possible disruptions, by detecting 
early feedback from citizens in a timely manner. 
Traditionally, ad hoc surveys are used to collect feedback in 
a structured manner. However, this kind of data collection is 
expensive, as it requires an investment in design and data 
collection; it is difficult, as people are not interested in 
answering surveys; and ultimately it is not very valuable, as 
it detects "known problems" through pre-defined questions 
and interviewees, but fails to detect the most important 
problems, the famous "unknown unknowns". 




Development Methodology: 

This is the output screen of our system, where you can enter 
a current trending topic, see the related tweets, and get 
people's opinions about it. 



We can also enter any keywords to check whether people are 
aware of or active on a topic. The next step is to see the 
analysis and the name of the person who posted each tweet. 



Another way to get a review is the Twitter world trends view: 



On this world map, we can select any country to see the 
trending topics in that country and to learn the public 
point of view. 

Conclusion 

There is a lot of scope in analyzing the videos and images on 
the web. Nowadays, with the advent of Facebook, Instagram, 
and video Vines, people are expressing their thoughts with 
pictures and videos along with text. Sentiment analysis will 
have to keep pace with this change. Tools that help 
companies change strategies based on Facebook and 
Twitter will also have to take into account the number of 
likes and retweets that a thought generates on social 
media. People follow and unfollow other people and comment 
threads on social media without ever commenting themselves, 
so there is scope in analyzing these aspects of the Web as well. 